Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow building the image without needing to launch it #1647

Merged
merged 48 commits into from
Oct 10, 2023

Conversation

consideRatio
Copy link
Member

@consideRatio consideRatio commented Mar 24, 2023

(PR written and made by @GeorgianaElena, it was just initialized by @consideRatio to get a branch to reference)

This PR adds a feature that offers the option of an API only mode, and in this mode, it allows opting-out of launching directly after build.

It adds:

  • a traitlet called enable_api_only_node that defaults to False to preserve existing default behavior. When enabled, there will be no UI, with the only registered endpoints being:
    • /metrics
    • /versions
    • /build
    • /health
    • /_config
    • /* -> shows a custom 404 page
  • a query parameter called build_only to be passed to individual requests to enable the build only only when enable_api_only_node is True
flowchart LR
  subgraph API only mode enabled
     direction TB
       id1{{false}}
       id2{{false}}
       id3{{false}}
       id4{{true}}
       id5{{true}}
       id6{{true}}
  end

  subgraph `build_only` query param
     direction TB
     id1 ---|and|id11{{missing}}
     id2 ---|and|id12{{false}}
     id3 ---|and|id13{{true}}
     id4 ---|and|id14{{missing}}
     id5 ---|and|id15{{false}}
     id6 ---|and|id16{{true}}
  end

   subgraph Outcome
     direction TB
     id11 ==> id21{{OK, image will be launched after build}}
     id12 ==> id22{{OK, image will be launched after build}}
     id13 ==> id23{{ERROR, building but not launching is not permitted when API only mode is not enabled}}
     id14 ==> id24{{OK, image will be launched after build}}
     id15 ==> id25{{OK, image will be launched after build}}
     id16 ==> id26{{OK, image will NOT be launched after build}}
  end
  • tests
    • for the configuration corner cases that generate errors
    • for the actual build only config

It also updates the auth tests for consistency to expect a more descriptive pytest request parameter value in order for the app fixture to load the extra auth configuration. This is for consistency with the build only extra config that was added in this PR.

Todo

  • Document this once there's agreement on naming an evaluation of the the traitlet and query parameter values to come to an outcome.

Background

This is meant as a building block for the https://github.com/2i2c-org/binderhub-service project. In brief, its positioning itself to be relevant specifically for situations when a binderhub UI is relevant together with a JupyterHub behind authentication and that also provides user home folder storage.

Copy link
Collaborator

@yuvipanda yuvipanda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much for working on this, @GeorgianaElena!

In thinking about config, I was originally thinking about two traitlets and a query param, and made this table:

Scenario allow_no_launch traitlet require_no_launch traitlet launch=false query parameter
mybinder.org / public binders false false error is returned, saying 'building but not launching is not permitted'
binderhub used as a service to build images for a JupyterHub true true works as intended
Mysterious scenario where a single binderhub is used both to launch into a preconfigured JupyterHub but also as an API for just building images true false works as intended

However, upon looking at this table now, I think we don't really want to support row 3 - I don't see a good use case for that. So instead, I think the table should be:

Scenario require_no_launch traitlet launch=false query parameter
mybinder.org / public binders false error is returned, saying 'building but not launching is not permitted'
binderhub used as a service to build images for a JupyterHub true works as intended. If launch=false is not specified, return an error

So in effect, to deploy a binderhub that will not try to launch into an existing JupyterHub automatically, the following things must both happen:

  1. The binderhub Admin must set this traitlet config to True, stating their intention that they do not want to launch anywhere. This also allows them to skip the config for connecting a JupyterHub to this.
  2. The person building the frontend code that makes this API call must also set a query parameter, stating their intention to just get back a built image spec - and not want an automatic launch.

I think the current combination of traitlets and query parameters is close, but doesn't quite implement this. Currently, the query parameter just overrides the traitlet - so the image gets built and not launched if either the traitlet or the query param is true. For public facing binders like mybinder.org, I don't think we want to allow this behavior - we want to allow image building but not launching if both the traitlet and the query parameter is true. Switch from an or to an and.

I hope this makes sense! Let me know if it does not, and I can try clarify.

@GeorgianaElena
Copy link
Member

I think I understand @yuvipanda. Thanks for the explanation. Can you double check the following points to make sure I got it right, please?

Double check list

  1. When require_no_launch = False, then a mybinder scenario is considered, meaning that the query can be:
  • missing, in which case everything remains unchanged
  • launch=true, in which case everything remains unchanged
  • launch=false, in which case error is returned, saying 'building but not launching is not permitted'
  1. When require_no_launch = True, then a service-like scenario is considered, meaning that the query can be:
  • missing, in which case return an error
  • launch=true, in which case we launch?? (don't sure about this one. Do we still want an and situation herre also?)
  • launch=false, in which case we don't launch

Prior discussions

Currently, the query parameter just overrides the traitlet - so the image gets built and not launched if either the traitlet or the query param is true.

I believe @consideRatio made a point about why this (overriding through the query the traitlet) might be useful, but don't remember what it was 😬 @consideRatio wdyt?

Naming

Thinking more about the naming, I realize that having a traitlet with a negation in its name, and then the query without, it's kind of confusion. Also, checking if no require_no_launch is also confusing I believe.

What do you think about a:

  • traitlet called allow_build_only if we want to allow image building only if both the traitlet and the query parameter are try true
  • query param called build_only

@yuvipanda
Copy link
Collaborator

launch=true, in which case we launch?? (don't sure about this one. Do we still want an and situation herre also?)

I think in this case, we return an error - launch=false is required. This allows the binderhub to exist without requiring a JupyterHub it has launch rights on to be configured.

I like your proposed build based naming scheme, we can go with that. Thank you for thinking that through!

@GeorgianaElena
Copy link
Member

@yuvipanda, I've updated the PR if you want to take another look

Copy link
Member Author

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this @GeorgianaElena and @yuvipanda!!

binderhub/tests/test_build.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
@GeorgianaElena GeorgianaElena marked this pull request as ready for review April 21, 2023 12:46
@GeorgianaElena
Copy link
Member

GeorgianaElena commented Apr 21, 2023

Thanks for the feedback @consideRatio ✨ it helped me simplify and correct the logic a bit more while also keeping the old log messages. I've also managed to get an understanding of the current tests and added some for the new functionality.

This is why I marked this as ready for review.

Note that it still lacks documentation but I plan to add that once there's agreement on the implementation.

P.S. thanks @colliand for showing off https://github.com/mermaid-js/mermaid in another PR. This was used to create the diagram in the top comment #1647 (comment) :D

@consideRatio
Copy link
Member Author

consideRatio commented Apr 24, 2023

Thank you for working this @GeorgianaElena!!! I'll look at this tomorrow/monday morning!

Copy link
Member Author

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work on this @GeorgianaElena!!!

binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/app.py Outdated Show resolved Hide resolved
binderhub/tests/conftest.py Outdated Show resolved Hide resolved
binderhub/tests/test_build.py Outdated Show resolved Hide resolved
binderhub/tests/test_build.py Outdated Show resolved Hide resolved
binderhub/tests/test_build.py Outdated Show resolved Hide resolved
Copy link
Member Author

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me @GeorgianaElena!

With this well documented traitlet for an opt-in feature only to be used in certain circumstances, I consider this mergable as it is.

@GeorgianaElena
Copy link
Member

After testing this out with @consideRatio yesterday, we concluded that even though we decided that it is out of scope of this PR to support a new UI for the build only option, it is confusing to still have the classical build + launch available.

When clicking the launch button, that will imediatlly fail with:
Screenshot 2023-04-28 at 11 08 02

With the last commit, b82758e, when require_build_only is set, the classical UI is deactivated and instead a 204 No content page is rendered using the error template. It looks like below:
Screenshot 2023-04-28 at 10 41 01

Does something like this makes sense?

@GeorgianaElena
Copy link
Member

Failures seem to be happening on the main branch as well https://github.com/jupyterhub/binderhub/actions/runs/4829259002/jobs/8604077929 so not sure which if any are related to the PR

@consideRatio
Copy link
Member Author

@GeorgianaElena I've been debugging this for quite some time now, I'm drawing blank.

Apparently the hub pods launching are stuck pending without k8s Events associated with them. That indicates to me they aren't getting scheduled, not even failing to get scheduled to a node I think - but why?

@GeorgianaElena
Copy link
Member

Just added a commit to not register any handlers for / and /v2 when require_build_only is True instead to keep it simple and secure as per discussion with @consideRatio

Copy link
Member Author

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still brainstorming not confident on what I think is desirable

  • Do we want to de-register /about as well, and possibly /_config that I don't know what it does? And along with that perhaps deregistering everything about favicons and badges etc?
  • Should Custom404 respond with JSON instead if require_build_only is set?
  • If require_build_only pretty much is "api only mode", should we make api_only_mode a standalone configuration or similar? Hmmm, this is a jupyterhub thing as well i recall... Yes https://jupyterhub.readthedocs.io/en/latest/howto/api-only.html writes about an API Only mode.

binderhub/app.py Outdated Show resolved Hide resolved
@minrk minrk changed the title Opt-in feature to allow opting out of launching following build Opt-in feature to allow opting out of launch following build May 11, 2023
Copy link
Member Author

@consideRatio consideRatio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great to me!

I think that the prometheus metrics BUILD_TIME and BUILD_COUNT should be tracked still, let's resolve that and ask someone else to review this after that.

binderhub/builder.py Outdated Show resolved Hide resolved
binderhub/builder.py Outdated Show resolved Hide resolved
@GeorgianaElena
Copy link
Member

@minrk, I've added a few commits based on your feedback above (thank you),. I've implemented a combination of your suggestions above! What do you think about:

  • give the config a name that indicates how much is changing, e.g. build_only_api_only or disable_launch_and_ui
  • accept the only allowed value when unspecified in both cases, rather than requiring opt-in to the only accepted value (I’m less sure about this, especially in more detailed proposal below)
  • enable_api_only_mode config
    This introduces an API mode that only keeps the /build endpoint and a few other that are informative, like metrics, versions, health etc.

    Without additional config, this mode does not assume anything about a build only behavior

  • build_only query parameter

    This flag enables build only when it's set to True. In all other cases, the /build endpoint will do what it does by default, which is to build + launch.

    ⬆️ This is what I am unsure of. Does an api only mode only makes sense in a build only scenario?

Copy link
Member

@minrk minrk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for leaving you waiting so long, it's a busy Fall! I left some minor comments to address in naming and clarity, but I think this is solid as is, and feel free to merge and iterate on refinements in smaller later PRs.

Does an api only mode only makes sense in a build only scenario?

I don't think so. A separate service could use an api-only BinderHub to build and launch and then redirect to the running server. An example would be a service that uses BinderHub, but presents its own UI to the user before redirecting to a running Jupyter server. Much like BinderHub itself uses JupyterHub in api-only mode.

build_only_outcome = False
if build_only_query_parameter.lower() == "true":
if not enable_api_only_mode:
raise HTTPError(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This lacks a status code. Presumably it should be 400?

@@ -228,6 +228,24 @@ def set_default_headers(self):
self.set_header("content-type", "text/event-stream")
self.set_header("cache-control", "no-cache")

def _get_build_only_outcome(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd call this _get_build_only(), since it's the dynamic resolution of the value of build_only with checks, etc., not a separate concept called 'outcome', which confused me a bit.

@@ -408,33 +426,52 @@ async def get(self, provider_prefix, _unescaped_spec):
else:
image_found = True

if image_found:
build_only_outcome = self._get_build_only_outcome()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, let's call it build_only, which I think is clearer than adding outcome

@GeorgianaElena
Copy link
Member

Thank you very much @minrk for the review and suggestions. I've added a commit that addresses them.

Some tests seem to be failing now though, but the same are failing for the main branch as well, so I don't think they are related to this PR?

@minrk
Copy link
Member

minrk commented Oct 10, 2023

Yes, failures appear to be helm related, not related to this PR (one image pull failure and one memory limit issue). Going to merge this one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants